Crop model with R and {tidyverse}
Part 2

Introduction

  • In this tutorial, we will see how we can perform an uncertainty analysis and a sensitivity analysis for a crop model

  • Based on the result of the sensitivity analysis, we will then see how we can optimize the model, to reduce the gap between the previsions and the observations

  • This tutorial follows this lesson, where we learned how to code a crop model with R

Introduction

As weather input, we will use the same data set as in the previous tutorial:

library(tidyverse)

day_sowing<-92                # Sowing after 1st May
day_harvest<-287              # Harvest ~ mid-october

data<-readr::read_delim("https://raw.githubusercontent.com/BjnNowak/CropModel/main/Weather_DesMoines.csv",delim=";")%>%
  arrange(LST_DATE)%>%
  mutate(day_number = row_number())%>%
  select(day_number,tas = T_DAILY_MEAN,rsds = SOLARAD_DAILY)%>%
  filter(day_number>=day_sowing, day_number<=day_harvest)

# Check first lines
head(data)
# A tibble: 6 × 3
  day_number   tas  rsds
       <int> <dbl> <dbl>
1         92   9.7 19.4 
2         93  13.6 10.2 
3         94   4.3  1.72
4         95   3.1 18.1 
5         96   7.7 22.2 
6         97  13.2  4.78

Introduction

Previously, we only coded the first part of the model, to determine the number of leaves for corn. In order to save some time, I wrote a script with the rest of the equations, that can be loaded with the source() function as shown below:

# Load script from github
source('https://raw.githubusercontent.com/BjnNowak/Lessons/main/scripts/crop_model_function.R')

# Take a look at the fun_model() function
fun_model
function (name, data, GDD_1leaf, C, RUE, nthresh) 
{
    max_nleaf <- 20
    T0 <- 6
    f <- 0.5
    frac <- 0.7
    model <- data %>% dplyr::mutate(TT = dplyr::case_when(tas < 
        T0 ~ 0, tas >= T0 ~ tas - T0)) %>% mutate(GDD = cumsum(TT)) %>% 
        mutate(pot_nleaf = GDD/GDD_1leaf) %>% mutate(nleaf = case_when(pot_nleaf <= 
        max_nleaf ~ round(pot_nleaf), pot_nleaf > max_nleaf ~ 
        max_nleaf)) %>% mutate(PAR_inc = f * rsds) %>% mutate(APAR = PAR_inc * 
        (1 - exp(-C * nleaf))) %>% mutate(NPP = RUE * APAR) %>% 
        mutate(biom = cumsum(NPP)) %>% mutate(NPPgrain = case_when(nleaf < 
        nthresh ~ 0, nleaf >= nthresh ~ frac * NPP)) %>% mutate(grain = cumsum(NPPgrain)) %>% 
        mutate(grain_t = grain/100) %>% add_column(Scenario = name)
    return(model)
}

Model structure

The arguments of the fun_model() function are the following:

name           # Scenario name 
data           # Climatic variables inputs
GDD_1leaf      # Phyllochron
C              # Crops’s capacity to intercept light 
RUE            # Radiation use efficiency (gDM.MJ-1)
nthresh        # Max number of leaves 

Yield prediction

We will now predict the maximum potential corn yield for our climate conditions, unsing the fun_model() function:

baseline <- fun_model(
  name="DesMoines", 
  data=data, 
  GDD_1leaf = 50,
  C=0.12,
  RUE=2,
  nthresh = 16
)

head(baseline)
# A tibble: 6 × 15
  day_number   tas  rsds    TT   GDD pot_nleaf nleaf PAR_inc  APAR   NPP  biom
       <int> <dbl> <dbl> <dbl> <dbl>     <dbl> <dbl>   <dbl> <dbl> <dbl> <dbl>
1         92   9.7 19.4    3.7   3.7     0.074     0    9.68     0     0     0
2         93  13.6 10.2    7.6  11.3     0.226     0    5.09     0     0     0
3         94   4.3  1.72   0    11.3     0.226     0    0.86     0     0     0
4         95   3.1 18.1    0    11.3     0.226     0    9.06     0     0     0
5         96   7.7 22.2    1.7  13       0.26      0   11.1      0     0     0
6         97  13.2  4.78   7.2  20.2     0.404     0    2.39     0     0     0
# ℹ 4 more variables: NPPgrain <dbl>, grain <dbl>, grain_t <dbl>,
#   Scenario <chr>

Yield prediction

Daily grain yield previsions (in tons) are stored in the yield_t column

The results may be plotted as follows:

p1<-ggplot(data=baseline,aes(x=day_number,y=grain_t))+
  geom_line()+
  labs(  
    title = "Corn yield estimation for DesMoines",
    x = "Day number",
    y = "Potential max yield (t.ha-1)"
  )+
  theme_light()

p1

Yield prediction

If you are only interested in the final yield, it may be extracted as follows:

# Extract final yield
baseline%>%
  tail(1)%>%
  pull(grain_t)
[1] 13.63428

Uncertainty analysis

  • An uncertainty analysis is used to assess the variability of the previsions of the model

  • An uncertainty analysis provides a confidence interval for model predictions

  • To do this, the variability of all relevant parameters is considered simultaneously

Uncertainty analysis

  • For this case study, we will only focus on the variability of two parameters here:

    • C, which reflects the crops’s capacity to intercept light. It depends on the light extinction coefficient (reflecting light penetration in crop’s foliage), the individual leaf area, and the plant density. We will will consider here that C may vary between [0.06; 0.18] (ie up to 18% of light intercepted) ;


    • the Radiation Use Efficieny (RUE), in gDM.MJ-1, which estimates the conversion of the intercepted radiation into biomass. We will will consider here that RUE may vary between [1; 3].

Model structure

  • For this case study, we will only focus on the variability of two parameters here:

    • C, which reflects the crops’s capacity to intercept light;


    • the Radiation Use Efficieny (RUE), in gDM.MJ-1, which estimates the conversion of the intercepted radiation into biomass.

Uncertainty analysis

Now that we have set the variability limits for these parameters, let’s create a data table with random combinations of the parameters in their variability range

N=100                       # Number of draws

analysis<-tibble(
  # Column with C parameter values
  C_draw=runif(
    N,                      # Set number of draws
    0.06,                   # Min value
    0.18                    # Max value
  ),
  # Column with RUE parameter values
  RUE_draw=runif(
    N,1,3
  ),
  # Empty column for corresponding final yield values
  # (to be filled later)
  yield=0
)

Uncertainty analysis

We fill now complete the final yield values corresponding to each {C,RUE} combination with the fun_model() function, using a for loop

for (i in 1:N){
  
  analysis$yield[i]<-fun_model(
      name="DesMoines", 
      data=data, 
      # C and RUE combination based on row index
      C=analysis$C_draw[i],
      RUE=analysis$RUE_draw[i],
      # For this example, GDD and number of leaves are stable
      GDD_1leaf = 50,
      nthresh = 16
    )%>%
    # Extract final yield only
    tail(1)%>%
    pull(grain_t)
    
}

Uncertainty analysis

We may now plot the result of our uncertainty analysis

Considering the uncertainty of the parameters C and RUE, the final yield lies within the following limits:

p2<-ggplot(analysis,aes(y=yield,x=1))+
  geom_boxplot()+
  geom_jitter()+
  labs(  
    title = "Uncertainty analysis",
    subtitle="Each point corresponds to one simulation",
    x="",
    y = "Potential max yield (t.ha-1)"
  )+
  theme_light()

p2

Sensitivity analysis

  • A sensitivity analysis is used to determine the effect of uncertainty about the variables or parameters of a model on the final result

  • A sensitivity analysis identifies the parameters that have the greatest influence on the final model result

  • To do this, the variability of each relevant parameters is considered separately

Sensitivity analysis

We will start by initiating data table for both parameters C and NUE

N=100                       # Number of draws

analysis_C<-tibble(
  C_draw=runif(N, 0.06, 0.18),
  yield=0
)

analysis_RUE<-tibble(
  RUE_draw=runif(N,1,3),
  yield=0
)

Sensitivity analysis

Then we will fill these tables with a for loop

for (i in 1:100){
  # Analysis for C
  analysis_C$yield[i]<-fun_model(
    name="DesMoines", data=data, 
    # Variable parameters
    C=analysis_C$C_draw[i],
    # Stable parameters
    GDD_1leaf = 50,RUE=2,nthresh = 16
  )%>%tail(1)%>%pull(grain_t)
  
  # Analysis for RUE
  analysis_RUE$yield[i]<-fun_model(
    name="DesMoines", data=data, 
    # Variable parameters
    RUE=analysis_RUE$RUE_draw[i],
    # Stable parameters
    GDD_1leaf = 50,nthresh = 16, C=0.12
  )%>%tail(1)%>%pull(grain_t)
  
}

Sensitivity analysis

We may now plot the result of our sensitivity analysis (using scale() to plot the results for both parameters on a relative scale)

p3<-ggplot()+
  geom_line(
    analysis_C,mapping=aes(x=scale(C_draw),y=yield),
    color='blue'
  )+
  geom_line(
    analysis_RUE,mapping=aes(x=scale(RUE_draw),y=yield),
    color='red'
  )+
  labs(                                  
    title = "Sensitivity analysis",
    subtitle = "C (blue) and RUE (red)",
    x = "Parameters variation (relative scale)",
    y = "Potential max yield (t.ha-1)"
  )

p3

Model optimization


Which parameter should be optimized first to improve model performance?